Mining Scientific Terms and their Definitions: A Study of the ACL Anthology
Abstract
This paper presents DefMiner, a supervised sequence labeling system that identifies scientific terms and their accompanying definitions. DefMiner achieves 85% F1 on a Wikipedia benchmark corpus, significantly improving on the previous state of the art by 8%. We use DefMiner to process the ACL Anthology Reference Corpus (ARC), a large, real-world digital library of scientific articles in computational linguistics. The resulting automatically acquired glossary represents the terminology defined over several thousand individual research articles. We highlight several interesting observations: more definitions have been introduced in conference and workshop papers over the years, and multiword terms account for slightly less than half of all terms. From a list of popular defined terms in a corpus of computational linguistics papers, we find that concepts can often be placed into one of three categories: resources, methodologies and evaluation metrics.
Similar resources
A suggested Motivational Method for Teaching Scientific Terminology, With a Practical Example
Using a reductionist approach, the motivational method for teaching scientific terminology aims at breaking down terms and their definitions into separate components, i.e. morphemes and their semantic features, rather than establishing a connection between terms and their definitions as holistic units. In other words, the ultimate goal of this method is achieving “semantic motivation,” (as oppo...
Tracing Research Paradigm Change Using Terminological Methods. A Case Study on "Machine Translation" in the ACL Anthology Reference Corpus
This paper explores the use of terminology extraction methods for detecting paradigmatic changes in scientific articles. We use a statistical method for identifying salient nouns and adjectives that signal these paradigmatic changes. We then employ the extracted lexical units for discovering terms that are assumed to be central in characterising paradigm shifts. To assess the method’s performan...
Argumentative analysis of the ACL Anthology (Analyse argumentative du corpus de l'ACL (ACL Anthology)) [in French]
This paper presents an application of Text Zoning to the ACL Anthology. Text Zoning is known to be useful to characterize the content of papers, especially in the scientific domain. We show that recent techniques based on weakly supervised learning obtain excellent results on the ACL Anthology. Although these kinds of techniques are known in the domain, it is the first time they are applied to the ...
The ACL Anthology Searchbench
We describe a novel application for structured search in scientific digital libraries. The ACL Anthology Searchbench is meant to become a publicly available research tool to query the content of the ACL Anthology. The application provides search in both its bibliographic metadata and semantically analyzed full textual content. By combining these two features, very efficient and focused queries ...
Integrating User-Generated Content in the ACL Anthology
The ACL Anthology was revamped in 2012 to its second major version, encompassing faceted navigation, social media use, as well as author- and reader-generated content and comments on published work as part of the revised frontend user interface. At the backend, the Anthology was updated to incorporate its publication records into a database. We describe the ACL Anthology’s previous legacy, redesi...
Publication date: 2013